Adding Sanitizer interface and implementation #97
Adding Sanitizer interface and implementation #97longquanzheng wants to merge 23 commits intouber-java:masterfrom
Conversation
|
@alexeykudinkin Hi, can you take a look? I have addressed the comments from the previous PR. LMK if you have any others. |
|
looks like unit tests are failing. Working on fixing |
|
@alexeykudinkin Hi, can you take a look? I have fixed all the tests. Thanks in advance. |
|
@alexeykudinkin @andrewmains12 Hello, sorry to bother you guys again. I would really appreciate if you can help review it. |
|
@longquanzheng we would also need to write a benchmark for this one. |
core/src/main/java/com/uber/m3/tally/sanitizers/SanitizerImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/uber/m3/tally/sanitizers/SanitizerImpl.java
Outdated
Show resolved
Hide resolved
| * StringSanitizer is to sanitize strings | ||
| * It has a Sanitize method which returns a sanitized version of the input string value. | ||
| */ | ||
| public interface StringSanitizer { |
There was a problem hiding this comment.
I don't think we need standalone interface for this, Function interface should be reflective
There was a problem hiding this comment.
Yeah agreed. It's too heavy for a function to have interface.
You meant functional interface/lambda, correct?
There was a problem hiding this comment.
Changed to Function<String,String>
core/src/main/java/com/uber/m3/tally/sanitizers/SanitizeRange.java
Outdated
Show resolved
Hide resolved
|
|
||
| private SanitizeRange(char low, char high) { | ||
| this.low = low; | ||
| this.high = high; |
There was a problem hiding this comment.
Please add precondition asserting that high >= low
There was a problem hiding this comment.
I don't see an example of "Precondition"(maybe I didn't find in a correct way) so I throw a Runtime exception here. LMK if you want to assert in a different way.
There was a problem hiding this comment.
Yeah, precondition reference was mostly notional (please do not import Guava for that)
|
@alexeykudinkin thanks so much for your reviews. I have addressed your comments except bench tests. Could you give me more about how to write/add it? |
| protected String separator = DEFAULT_SEPARATOR; | ||
| protected ImmutableMap<String, String> tags; | ||
| protected Buckets defaultBuckets = DEFAULT_SCOPE_BUCKETS; | ||
| protected ScopeSanitizer sanitizer = new ScopeSanitizerBuilder().build(); |
There was a problem hiding this comment.
Let's make this opt-in -- by default sanitizer should be no-op
There was a problem hiding this comment.
Do you mean having it as null by default in the builder?
This would be a little bit tricky. Then the ScopeImpl will have to do this null check everywhere, is that okay? (It would be easier if this is Kotlin and we can use ? operator, sigh)
There was a problem hiding this comment.
NVM, looks like I will apply another comment from you and Andrey in #97 (comment)
|
@longquanzheng LGTM, minor comment. Regarding benchmarks, you can take a look at the benchmarks package for some examples; |
| } | ||
|
|
||
| ImmutableMap.Builder<String, String> builder = new ImmutableMap.Builder<>(); | ||
| if (tags != null) { |
There was a problem hiding this comment.
nit: this check is redundant. If tags is null it will fail on tags.entrySet()
There was a problem hiding this comment.
I'd rather re-phrase this: @longquanzheng please move this check at the beginning to make sure there's no NPE
|
|
||
| private CharRange(char low, char high) { | ||
| if (low > high) { | ||
| throw new RuntimeException("invalid CharRange"); |
There was a problem hiding this comment.
nit: s/RuntimException/IllegalArgumentException?
| private final Function<String, String> keySanitizer; | ||
| private final Function<String, String> valueSanitizer; | ||
|
|
||
| SanitizerImpl(Function<String, String> nameSanitizer, Function<String, String> keySanitizer, Function<String, String> valueSanitizer) { |
There was a problem hiding this comment.
nit: I'd make arguments name more explicit: tagKeySanitizer, tagValueSanitizer
There was a problem hiding this comment.
yeah good catch -- I should have done it, hehe
| */ | ||
| public class ScopeSanitizerBuilder { | ||
|
|
||
| private Function<String, String> nameSanitizer = value -> value; |
There was a problem hiding this comment.
nit: consider Function.identity()
|
|
||
| /** | ||
| * The SanitizerBuilder returns a Sanitizer for the name, key and value. By | ||
| * default, the name, key and value sanitize functions returns all the input |
There was a problem hiding this comment.
Maybe create a NoopSanitizer implements ScopeSanitizer and use it as a default one instead?
There was a problem hiding this comment.
Yeah I like this idea better.
Otherwise having null in ScopeImpl will require lots of null checking in the code.
| @@ -0,0 +1,89 @@ | |||
| // Copyright (c) 2020 Uber Technologies, Inc. | |||
There was a problem hiding this comment.
pls update copyright to 2021
| import static org.junit.Assert.assertEquals; | ||
| import static org.junit.Assert.assertNotNull; | ||
|
|
||
| public class SanitizeRangeTest { |
There was a problem hiding this comment.
nit: s/SanitizeRangeTest/CharRangeTest
| private static final char HIGH = 'z'; | ||
|
|
||
| @Test | ||
| public void sanitizeRange() { |
There was a problem hiding this comment.
pls add a test when HIGH is smaller than LOW and maybe when HIGH == LOW
|
|
||
| import static org.junit.Assert.assertEquals; | ||
|
|
||
| public class SanitizerTest { |
There was a problem hiding this comment.
nit: s/SanitizerTest/SanitizerImplTest or ScopeSanitizerTest
There was a problem hiding this comment.
changed to ScopeSanitizerTest
| .withKeyCharacters( | ||
| ValidCharacters.of( | ||
| ValidCharacters.ALPHANUMERIC_RANGE, | ||
| ValidCharacters.UNDERSCORE_DASH_CHARACTERS)) |
There was a problem hiding this comment.
iirc dot is also a valid character but not recommended
There was a problem hiding this comment.
changed to UNDERSCORE_DASH_CHARACTERS
SokolAndrey
left a comment
There was a problem hiding this comment.
Since you're modifying ScopeImpl default behaviour (even though it's a noop by default) would be nice to see how does it affect the benchmarks.
You can run those benchmark tests using the following command:
./gradlew tally-core:runJmhTests
Could you run it without and with your changes and add both results to the PR?
It would be nice to have separate benchmarks for sanitizer implementation though, but I don't think it has to be a part of this PR. @alexeykudinkin wdyt?
you can find more info/examples on JMH benchmarks here
|
Thanks both for the review. I will address the comments this week. |
|
@alexeykudinkin @SokolAndrey thank you so much again for the reviewing. I have addressed all the comments. Below is the bench results on my laptop: Before the PR(using master) After the PR: My second run on current PR: |
|
@longquanzheng i assume these runs are w/ no-op sanitizer? Can you also paste Benchmarks w/ the sanitizer actually used? |
Yes. Thanks for pointing it out. |
Just ran with the commit of sanitizer: |
|
@longquanzheng i don't think we have any Benchmarks before targeting specifically code paths you're changing. Let's make sure that Benchmarks are stressing code paths that are changing. |
add benchmark code
I got what you mean. Just added benchmark code. Can you take another look? |
|
@alexeykudinkin here is the results:
|
| @@ -0,0 +1,4 @@ | |||
| Benchmark Mode Cnt Score Error Units | |||
| ScopeImplBenchmark.recordingBenchmark thrpt 10 74.819 ± 4.573 ops/ms | |||
There was a problem hiding this comment.
Please check how other benchmarks are reported (we need to capture all data-points offered by JMH)
There was a problem hiding this comment.
Sorry, I don't really understand how exactly to do it.
Would you mind taking it over? I am not convinced that this should be done by this PR, as the benchmark is already missing. The most problem is I don't have a clear picture of what you really want for benchmarking(with out detailed documentation), and it should be much easier if your team just do it.
LMK if you agree.
There was a problem hiding this comment.
It should be done by this PR. This is a critical library laying in the hot-path of execution of many services, hence we need to maintain the focus on its performance.
Over the last year a lot of efforts have been put in to optimize, streamline, and make this library robust to serve the needs it was originally built for. Therefore, as a new contribution guideline we require any non-trivial change to adhere to the same basic principles of assuring library's correctness and performance.
Totally appreciate the amount of incremental effort that is required to adhere to this heightened standards from every contribution, but unfortunately there's no other way to guarantee high-level of reliability and performance of the open-source library otherwise.
There was a problem hiding this comment.
Please check how other benchmarks are reported
Can you give more details here? I did check but don't know how that works(though I see other reports have the details like gc time).
For example another benchmark is just
public class M3ReporterBenchmark extends AbstractReporterBenchmark. So I don't know what am I missing here.
| .reportEvery(Duration.MAX_VALUE); | ||
| } | ||
|
|
||
| public void recordTestMetrics(final ScopeImpl scope) { |
There was a problem hiding this comment.
I don't think this is a good routine to check -- this was used as pre-init seq, and i would suggest to keep it as such.
Instead create separate benchmarks and routines to update counter/gauge/histogram separately.
There was a problem hiding this comment.
Same as my pervious response. I appreciate your time reviewing and commenting it.
But it would be nice if you can help the benchmarking.
There was a problem hiding this comment.
Please check my comment above. We're more than happy to help w/ guidance, but benchmarking is now a standard requirement from non-trivial contributions.
There was a problem hiding this comment.
Sure, will revert that one back.
Can you give me details about what you want to benchmark? -- which method/function/routines?
There was a problem hiding this comment.
As i have called out prior, let's test creation and recording fro each type of metric individually (counter/gauge/histogram).
// Counter
scope.counter("counter").inc(1)
// Timer
scope.timer("timer").record(...)
// Histogram
...
This is replacement of #70